Abstract

The paper is concerned with the classical occupancy scheme with infinitely many boxes, in which $n$
balls are thrown independently into boxes $1, 2, \ldots$, with probability $p_j$ of hitting the box $j$, where
$p_1 \ge p_2 \ge \cdots > 0$ and $\sum_{j=1}^{\infty} p_j = 1$. We establish joint normal approximation as $n \to \infty$ for the
numbers of boxes containing $r_1, r_2, \ldots, r_m$ balls, standardized in the natural way, assuming only that
the variances of these counts all tend to infinity. The proof of this approximation is based on a 
de-Poissonization lemma. We then review sufficient conditions for the variances to tend to infinity. 
Typically, the normal approximation does not mean convergence. We show that the convergence of 
the full vector of r-counts only holds under a condition of regular variation, thus giving a complete 
characterization of possible limit correlation structures. 

1 Introduction 

In the classical occupancy scheme with infinitely many boxes, balls are thrown independently into boxes
$1, 2, \ldots$, with probability $p_j$ of hitting the box $j$, where $p_1 \ge p_2 \ge \cdots > 0$ and $\sum_{j=1}^{\infty} p_j = 1$. The most
studied quantity is the number of boxes $K_n$ occupied by at least one out of the first $n$ balls thrown. It
is known that for large $n$ the law of $K_n$ is asymptotically normal, provided that $\mathrm{Var}[K_n] \to \infty$; see [6], [7]
for references and a survey of this and related results. In this paper, we investigate the behaviour of the
quantities $X_{n,r}$, the numbers of boxes hit by exactly $r$ out of the $n$ balls, $r \ge 1$.

Under a condition of regular variation, a multivariate CLT for the $X_{n,r}$'s was proved by Karlin [5].
Mikhailov [12] also studied the $X_{n,r}$'s, but in a situation where the $p_j$'s vary with $n$. In this paper,
we establish joint normal approximation as $n \to \infty$ for the variables $X_{n,r_1}, \ldots, X_{n,r_m}$, centred and
normalized, assuming only that $\lim_{n\to\infty} \mathrm{Var}\, X_{n,r_i} = \infty$ for each $i$. We also give examples to show that
this condition is not enough to ensure convergence, since the correlation matrices need not converge as
$n \to \infty$. The asymptotic behaviour of the moments of the $X_{n,r}$ is thus of key importance, and we discuss
this under a number of simplifying assumptions.

The behaviour of these moments, as also of those of $K_n = \sum_{r=1}^{\infty} X_{n,r}$, depends on the way in which the
frequencies $p_j$ decay to 0. In the case of power-like decay, $p_j \sim c j^{-1/\alpha}$ with $0 < \alpha < 1$, it is known that,
for each fixed $k$, the moments $E X_{n,r}^k$ have the same order of growth with $n$ for every $r$, and this is the same
order of growth as that of $E K_n^k$; moreover, the limit distributions of $K_n$ and of $X_n := (X_{n,1}, X_{n,2}, \ldots)$ are
normal [HI [5]. In contrast, for a sequence of geometric frequencies $p_j = c q^j$ ($0 < q < 1$), there is no way
to scale the $X_{n,r}$'s to obtain a nontrivial limit distribution [TD], and the moments of $K_n$ have oscillatory
asymptotics. In a more general setting such that the $p_j$'s have exponential decay, the oscillatory behaviour
of $\mathrm{Var}[K_n]$ is typical [5]. The spectrum of interesting possibilities is, however, much wider: for instance,
frequencies $p_j \sim c e^{-j^{\beta}}$, with $0 < \beta < 1$, exhibit a decay intermediate between power and exponential.

Karlin's [8] multivariate CLT for $X_n$ applies when the index of regular variation is in the range
$0 < \alpha < 1$. We complement this by the analysis of the cases $\alpha = 0$ and $\alpha = 1$, showing that for each
$\alpha \in [0, 1]$ there is exactly one possible normal limit. Finally, we prove that these one-parameter normal
laws are the only possible limits of naturally scaled and centred $X_n$. Specifically, we show that a regular
variation condition holds if $\mathrm{Var}\, X_{n,r} \to \infty$ for all $r$ and if all the correlations $\{\mathrm{Corr}(X_{n,r}, X_{n,s}),\ r, s \ge 1\}$
converge.



2 Poissonization 

As in much previous work, we shall rely on a closely related occupancy scheme, in which the balls are 
thrown into the boxes at the times of a unit Poisson process. The advantage of this model is that, 
for every $t > 0$, the processes $(N_j(t),\ t \ge 0)$, counting the numbers of balls in boxes $j = 1, 2, \ldots$, are
independent. Let $Y_r(t)$ be the number of boxes occupied by exactly $r$ balls at time $t$. In view of the
representation

$$Y_r(t) = \sum_{j=1}^{\infty} 1[N_j(t) = r] \qquad (2.1)$$

with independent Bernoulli terms, it follows that

$$Y_r'(t) := \frac{Y_r(t) - E[Y_r(t)]}{\sqrt{\mathrm{Var}[Y_r(t)]}} \to_d \mathcal{N}(0, 1) \quad \text{as } t \to \infty \qquad (2.2)$$

if and only if $\mathrm{Var}[Y_r(t)] \to \infty$. This suggests that normal approximation can be approached most easily
through the $Y_r(t)$, provided that the de-Poissonization can be accomplished. We now show that this is
indeed the case.

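Let $\mathcal{L}(\cdot)$ denote the probability law of a random element, $d_{TV}$ the distance in total variation.

Lemma 2.1 For any $m, k \in \mathbb{N}$ satisfying $m \le \frac{1}{2} n p_k$, we have

$$d_{TV}\big(\mathcal{L}(X_{n,1}, \ldots, X_{n,m}),\ \mathcal{L}(Y_1(n), \ldots, Y_m(n))\big) \le \pi_k + 2k e^{-n p_k/10},$$

where $\pi_k := \sum_{j=k+1}^{\infty} p_j$.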

Proof. We begin by noting that, in parallel to (2.1),

$$X_{n,r} := \sum_{j=1}^{\infty} 1[M_{n,j} = r], \qquad (2.3)$$

where $M_{n,j}$ represents the number of balls out of the first $n$ thrown that fall into box $j$. Our proof uses
lower truncation of the sums (2.1) and (2.3) that define $Y_r(n)$ and $X_{n,r}$.

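Since $M_{n,j} \sim \mathrm{Binomial}(n, p_j)$, it follows from the Chernoff inequalities [5] that, if $m \le \frac{1}{2} n p_k$, then for
$j \le k$,

$$P[M_{n,j} \le m] \le P[M_{n,j} \le \tfrac{1}{2} n p_j] \le \exp\{-n p_j/10\} \le \exp\{-n p_k/10\},$$

since the $p_j$ are decreasing, and $m \le \frac{1}{2} n p_k$; and the same bound holds also for $N_j(n) \sim \mathrm{Poisson}(n p_j)$.
Hence, defining

$$X_{n,k,r} := \sum_{j=k+1}^{\infty} 1[M_{n,j} = r], \qquad Y_{k,r}(t) := \sum_{j=k+1}^{\infty} 1[N_j(t) = r],$$

it follows that

$$d_{TV}\big(\mathcal{L}(X_{n,1}, \ldots, X_{n,m}),\ \mathcal{L}(X_{n,k,1}, \ldots, X_{n,k,m})\big) \le k e^{-n p_k/10}; \qquad (2.4)$$

$$d_{TV}\big(\mathcal{L}(Y_1(t), \ldots, Y_m(t)),\ \mathcal{L}(Y_{k,1}(t), \ldots, Y_{k,m}(t))\big) \le k e^{-t p_k/10}. \qquad (2.5)$$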
But now, from an inequality of Le Cam ^4J and Michel [11] , we have 

$$d_{TV}\big(\mathcal{L}(N_j(n),\ j \ge k+1),\ \mathcal{L}(M_{n,j},\ j \ge k+1)\big) \le \pi_k, \qquad (2.6)$$

and the $X_{n,k,r}$ are functions of $\{M_{n,j},\ j \ge k+1\}$, the $Y_{k,r}(n)$ of $\{N_j(n),\ j \ge k+1\}$. The lemma now
follows from (2.4), (2.5) and (2.6). □
Proposition 2.2 Let $k(n)$ be any sequence satisfying

$$k(n) \to \infty \quad \text{and} \quad k(n) e^{-n p_{k(n)}/10} \to 0.$$

Then, for any sequence $m(n)$ satisfying $m(n) \le \frac{1}{2} n p_{k(n)}$ for each $n$, it follows that

$$d_{TV}\big(\mathcal{L}(X_{n,1}, \ldots, X_{n,m(n)}),\ \mathcal{L}(Y_1(n), \ldots, Y_{m(n)}(n))\big) \to 0. \qquad (2.7)$$

Proof. Since $m(n) \le \frac{1}{2} n p_{k(n)}$ for each $n$, it follows that Lemma 2.1 can be applied for each $n$. Since
$k(n) \to \infty$, it follows that $\pi_{k(n)} \to 0$, so that the first element in its bound converges to zero; the second
converges to zero also, by assumption. □

Remark. Such sequences $k(n)$ always exist. For instance, one can take

$$k(n) = \max\{k : 20 \log k / p_k \le n\}.$$

For this choice, it is immediate that $k(n) \to \infty$, and that $n p_{k(n)} \ge 20 \log k(n) \to \infty$, entailing also that
$k(n) e^{-n p_{k(n)}/10} \le 1/k(n) \to 0$. Hence there are always sequences $m(n) \to \infty$ for which (2.7) is satisfied.

Hence, in particular, any approximation to the distribution of a finite subset of the components
of $Y(n) = (Y_1(n), Y_2(n), \ldots)$ (suitably scaled) remains valid for the corresponding components of $X_n$, at
the cost of introducing an extra, asymptotically negligible, error in total variation of at most

$$\pi_{k(n)} + 2k(n) e^{-n p_{k(n)}/10}, \qquad (2.8)$$

where $k(n)$ is any sequence satisfying the conditions of Proposition 2.2.
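
As a rough numerical illustration of the Remark and of the bound (2.8), the following Python sketch computes $k(n)$ and the resulting error term for one concrete choice of frequencies. The choice $p_j = 6/(\pi^2 j^2)$ and all function names are assumptions made for this illustration only; they are not taken from the paper.

```python
import math

def p(j: int) -> float:
    """Illustrative frequencies p_j = 6 / (pi^2 j^2); they sum to 1 over j >= 1."""
    return 6.0 / (math.pi ** 2 * j * j)

def k_of_n(n: float) -> int:
    """k(n) = max{k : 20 log(k) / p_k <= n}, the choice suggested in the Remark.

    20 log(k) / p_k is nondecreasing in k, so a linear scan finds the maximum.
    """
    k = 1
    while 20.0 * math.log(k + 1) / p(k + 1) <= n:
        k += 1
    return k

def tail(k: int) -> float:
    """pi_k = sum_{j > k} p_j, computed as 1 minus the first k frequencies."""
    return 1.0 - sum(p(j) for j in range(1, k + 1))

def depoissonization_bound(n: float) -> float:
    """The quantity pi_k + 2 k exp(-n p_k / 10) of (2.8), evaluated at k = k(n)."""
    k = k_of_n(n)
    return tail(k) + 2.0 * k * math.exp(-n * p(k) / 10.0)

for n in (10**4, 10**6, 10**8):
    print(n, k_of_n(n), depoissonization_bound(n))
```

For this heavy-tailed example the bound decreases only slowly with $n$, since $\pi_k$ is of order $1/k$.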



3 Normal approximation 

As noted above, the distribution of $Y_r(t)$ is asymptotically normal as $t \to \infty$ whenever $\mathrm{Var}\, Y_r(t) \to \infty$.
Here, we consider the joint normal approximation of any finite set of counts $Y_{r_1}(t), \ldots, Y_{r_m}(t)$ such that
$r_i \ge 1$ and $\lim_{t\to\infty} \mathrm{Var}\, Y_{r_i}(t) = \infty$ for each $1 \le i \le m$. We measure the closeness of two probability
measures $P$ and $Q$ on $\mathbb{R}^m$ in terms of differences between the probabilities assigned to arbitrary convex
sets:

$$d_{\mathcal{C}}(P, Q) := \sup_{A \in \mathcal{C}} |P(A) - Q(A)|,$$

where $\mathcal{C}$ denotes the class of convex subsets of $\mathbb{R}^m$. Let

$$\Phi_r(t) := E\, Y_r(t), \qquad V_r(t) := \mathrm{Var}\, Y_r(t), \qquad C_{rs}(t) := \mathrm{Cov}(Y_r(t), Y_s(t))$$

denote the moments of the $Y_r(t)$, and let

$$\Sigma_{rs}(t) := C_{rs}(t)/\sqrt{V_r(t) V_s(t)} = \mathrm{Cov}(Y_r'(t), Y_s'(t))$$

denote the covariance matrix of the standardized random variables $Y_r'(t)$ as in (2.2).

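Now the random vector $(Y'_{r_1}(t), \ldots, Y'_{r_m}(t))$ is a sum of independent mean zero random vectors
$(Y'_{l,r_1}(t), \ldots, Y'_{l,r_m}(t))$, $l \ge 1$, where $Y'_{l,r}(t) := (1[N_l(t) = r] - p_{l,r}(t))/\sqrt{V_r(t)}$, and

$$p_{l,r}(t) := P[N_l(t) = r] = e^{-t p_l} \frac{(t p_l)^r}{r!}. \qquad (3.1)$$

A theorem of Bentkus [1, Thm. 1.1] then shows that

$$d_{\mathcal{C}}\big(\mathcal{L}(Y'_{r_1}(t), \ldots, Y'_{r_m}(t)),\ \mathrm{MVN}_m(0, \Sigma_R(t))\big) \le C m^{1/4} \beta_t$$

for an absolute constant $C$, where

$$\beta_t := \sum_{l \ge 1} \beta_{t,l} \quad \text{and} \quad \beta_{t,l} := E\big|\Sigma_R^{-1/2}(t) (Y'_{l,r_1}(t), \ldots, Y'_{l,r_m}(t))^T\big|^3,$$

and $\Sigma_R(t)$ denotes the $m \times m$ matrix with elements $\{\Sigma_{rs}(t),\ r, s \in R := \{r_1, \ldots, r_m\}\}$. Applying this
result, we obtain the following theorem.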
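Theorem 3.1 If $\lim_{t\to\infty} V_{r_i}(t) = \infty$ for each $1 \le i \le m$, where $1 \le r_1 < \ldots < r_m$, then, as $t$ and $n$
tend to $\infty$,

$$d_{\mathcal{C}}\big(\mathcal{L}(Y'_{r_1}(t), \ldots, Y'_{r_m}(t)),\ \mathrm{MVN}_m(0, \Sigma_R(t))\big) = O\Big(1/\min_{1 \le i \le m} \sqrt{V_{r_i}(t)}\Big) \to 0;$$

$$d_{\mathcal{C}}\big(\mathcal{L}(X'_{n,r_1}, \ldots, X'_{n,r_m}),\ \mathrm{MVN}_m(0, \Sigma_R(n))\big) = O\Big(\pi_{k(n)} + 2k(n) e^{-n p_{k(n)}/10} + 1/\min_{1 \le i \le m} \sqrt{V_{r_i}(n)}\Big) \to 0,$$

where $k(n)$ is any sequence chosen as for Proposition 2.2 and satisfying $\max_{1 \le j \le m} r_j \le \frac{1}{2} n p_{k(n)}$ for
each $n$. If, in addition, $\Sigma_R(t) \to \Sigma_R$ as $t \to \infty$, for some fixed $\Sigma_R$, then

$$(Y'_{r_1}(t), \ldots, Y'_{r_m}(t)) \to_d \mathrm{MVN}_m(0, \Sigma_R) \quad \text{and} \quad (X'_{n,r_1}, \ldots, X'_{n,r_m}) \to_d \mathrm{MVN}_m(0, \Sigma_R).$$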

Proof. All that we need to do is to control the quantity $\beta_t$. This in turn involves bounding the smallest
eigenvalue of $\Sigma_R(t)$ away from 0. Now direct calculation shows that, for any column vector $a \in \mathbb{R}^m$,

$$a^T \Sigma_R(t) a = \mathrm{Var}\Big(\sum_{j=1}^{m} a_j Y'_{r_j}(t)\Big) = \sum_{l \ge 1} \mathrm{Var}\Big(\sum_{j=1}^{m} a_j Y'_{l,r_j}(t)\Big).$$

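Using the definition of $Y'_{l,r}(t)$, this gives

$$a^T \Sigma_R(t) a = \sum_{l \ge 1} p^{(l,R)}(t) \Big\{ E^{(l,R)} U^2 - p^{(l,R)}(t) \big(E^{(l,R)} U\big)^2 \Big\},$$

where $p^{(l,R)}(t) := \sum_{r \in R} p_{l,r}(t)$ and, under the measure $P^{(l,R)}$ (with expectation $E^{(l,R)}$), $U$ takes the value $a_j/\sqrt{V_{r_j}(t)}$ with
probability $p_{l,r_j}(t)/p^{(l,R)}(t)$, $1 \le j \le m$. This in turn implies that

$$a^T \Sigma_R(t) a \ge \sum_{l \ge 1} p^{(l,R)}(t) \big(1 - p^{(l,R)}(t)\big) E^{(l,R)} U^2,$$

and since

$$p^{(l,R)}(t)\, E^{(l,R)} U^2 = \sum_{j=1}^{m} a_j^2\, \frac{p_{l,r_j}(t)}{V_{r_j}(t)},$$

it follows that

$$a^T \Sigma_R(t) a \ge \sum_{l \ge 1} \big(1 - p^{(l,R)}(t)\big) \sum_{j=1}^{m} a_j^2\, \frac{p_{l,r_j}(t)}{V_{r_j}(t)} \ge \min_{l} \big(1 - p^{(l,R)}(t)\big)\, a^T a,$$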

since $V_r(t) \le \sum_{l \ge 1} p_{l,r}(t)$. However, for each $l$, $p^{(l,R)}(t) \le 1 - p_{l,0}(t) - \sum_{j > r_m} p_{l,j}(t)$, and $(p_{l,r}(t),\ r \ge 1)$
are just the Poisson probabilities (3.1). Hence $1 - p^{(l,R)}(t) \ge e^{-1}$ if $t p_l \le 1$, and $1 - p^{(l,R)}(t) \ge q(r_m) :=
\mathrm{Poisson}(1)\{[r_m + 1, \infty)\}$ if $t p_l > 1$, implying that

$$\min_{l} \big(1 - p^{(l,R)}(t)\big) \ge c_R := \min\{e^{-1}, q(r_m)\} > 0,$$

for all $t$. It thus follows that $a^T \Sigma_R(t) a \ge c_R\, a^T a$ for all $a \in \mathbb{R}^m$.

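It is now immediate that, for any $x \in \mathbb{R}^m$, $|\Sigma_R^{-1/2}(t) x| \le c_R^{-1/2} |x|$, and hence, since
$|(Y'_{l,r_1}(t), \ldots, Y'_{l,r_m}(t))| \le \sqrt{m}/\min_{1 \le i \le m} \sqrt{V_{r_i}(t)}$ a.s., we have

$$\big|\Sigma_R^{-1/2}(t) (Y'_{l,r_1}(t), \ldots, Y'_{l,r_m}(t))^T\big|^3 \le c_R^{-3/2}\, \frac{\sqrt{m}\, |(Y'_{l,r_1}(t), \ldots, Y'_{l,r_m}(t))|^2}{\min_{1 \le i \le m} \sqrt{V_{r_i}(t)}};$$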

taking expectations and adding over $l \ge 1$ gives $\beta_t \le (m/c_R)^{3/2}/\min_{1 \le i \le m} \sqrt{V_{r_i}(t)}$, proving the first
statement of the theorem. The second follows in view of (2.8). □
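
The eigenvalue bound used in the proof can be checked numerically in the simplest case $m = 2$, where the smallest eigenvalue of the $2 \times 2$ correlation matrix is $1 - |\Sigma_{rs}(t)|$. The sketch below is only an illustration under stated assumptions: the finite list of frequencies (proportional to $j^{-2}$, $j \le 2000$) and the helper names are hypothetical choices, not part of the paper.

```python
import math

def pmf(x: float, r: int) -> float:
    """Poisson(x) probability of the value r, as in (3.1) with x = t * p_l."""
    return math.exp(-x) * x ** r / math.factorial(r)

def sigma_rs(p, r: int, s: int, t: float) -> float:
    """Correlation of Y_r(t) and Y_s(t), r != s, for independent Poisson box counts."""
    pr = [pmf(t * pj, r) for pj in p]
    ps = [pmf(t * pj, s) for pj in p]
    v_r = sum(a * (1.0 - a) for a in pr)
    v_s = sum(b * (1.0 - b) for b in ps)
    c_rs = -sum(a * b for a, b in zip(pr, ps))  # indicators of {N = r}, {N = s} are disjoint
    return c_rs / math.sqrt(v_r * v_s)

def c_R(r_max: int) -> float:
    """c_R = min{e^{-1}, P[Poisson(1) >= r_max + 1]} from the proof."""
    q = 1.0 - sum(pmf(1.0, j) for j in range(r_max + 1))
    return min(math.exp(-1.0), q)

# Assumed example frequencies: p_j proportional to j^{-2}, j = 1..2000.
weights = [1.0 / (j * j) for j in range(1, 2001)]
total = sum(weights)
P = [w / total for w in weights]

for t in (1.0, 10.0, 100.0, 1000.0, 10000.0):
    smallest_eigenvalue = 1.0 - abs(sigma_rs(P, 1, 2, t))
    print(t, smallest_eigenvalue, ">=", c_R(2))
```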



Thus multivariate normal approximation is always good if the variances of the (unstandardized) components
$Y_r(t)$ are large. However, convergence typically does not take place: see a series of examples in
Proposition 4.4 below.
4 Moments

For normal approximation, in view of Theorem 3.1, we are particularly interested in conditions under
which $V_r(t) \to \infty$.

For the moments we have the formulas

$$\Phi_r(t) = \sum_{j=1}^{\infty} p_{j,r}(t), \qquad (4.1)$$

$$V_r(t) = \sum_{j=1}^{\infty} p_{j,r}(t)\big(1 - p_{j,r}(t)\big) = \Phi_r(t) - 2^{-2r} \binom{2r}{r} \Phi_{2r}(2t), \qquad (4.2)$$

$$C_{rs}(t) = -2^{-r-s} \binom{r+s}{r} \Phi_{r+s}(2t), \quad r \ne s, \qquad (4.3)$$

where, as above, $p_{j,r}(t) = e^{-t p_j} (t p_j)^r / r!$.
From (4.1) and (4.2) we obtain

$$\Phi_r(t) > V_r(t) > k_r \Phi_r(t),$$

with $k_r > 0$, as is seen from the inequalities

$$1 > 1 - \frac{e^{-x} x^r}{r!} \ge 1 - \frac{e^{-r} r^r}{r!} > 0$$

for $x > 0$. It follows that

$$V_r(t) \to \infty \iff \Phi_r(t) \to \infty;$$

hence, as long as only the convergence to infinity of $V_r(t)$ is concerned, we can deal with the simpler
quantity $\Phi_r(t)$. This facilitates the proof of the following theorem, showing how the asymptotic behaviour
of $V_r(t)$ for different values of $r$ is structured.
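
The identities (4.2) and (4.3) and the sandwich $k_r \Phi_r(t) < V_r(t) < \Phi_r(t)$ can be checked numerically. The sketch below does this for an assumed finite list of frequencies (proportional to $j^{-2}$, $j \le 2000$); the list and the function names are illustrative choices, not taken from the paper.

```python
import math

def pmf(x: float, r: int) -> float:
    """p_{j,r}(t) with x = t * p_j: the Poisson(x) probability of r."""
    return math.exp(-x) * x ** r / math.factorial(r)

def phi(p, r: int, t: float) -> float:
    """Phi_r(t) = sum_j p_{j,r}(t), formula (4.1)."""
    return sum(pmf(t * pj, r) for pj in p)

def var_direct(p, r: int, t: float) -> float:
    """V_r(t) as a sum of Bernoulli variances."""
    return sum(q * (1.0 - q) for q in (pmf(t * pj, r) for pj in p))

def var_formula(p, r: int, t: float) -> float:
    """Right-hand side of (4.2)."""
    return phi(p, r, t) - math.comb(2 * r, r) / 4 ** r * phi(p, 2 * r, 2 * t)

def cov_direct(p, r: int, s: int, t: float) -> float:
    """C_rs(t) for r != s: minus the sum of products of the two probabilities."""
    return -sum(pmf(t * pj, r) * pmf(t * pj, s) for pj in p)

def cov_formula(p, r: int, s: int, t: float) -> float:
    """Right-hand side of (4.3)."""
    return -math.comb(r + s, r) / 2 ** (r + s) * phi(p, r + s, 2 * t)

# Assumed example frequencies: p_j proportional to j^{-2}, j = 1..2000.
weights = [1.0 / (j * j) for j in range(1, 2001)]
total = sum(weights)
P = [w / total for w in weights]

for r in (1, 2, 3):
    k_r = 1.0 - math.exp(-r) * r ** r / math.factorial(r)
    print(r, k_r * phi(P, r, 50.0), var_direct(P, r, 50.0), phi(P, r, 50.0))
```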

Theorem 4.1 The asymptotic behaviour of the quantities $V_r(t)$ as $t \to \infty$ follows one of the following
four regimes:

1. $\lim_{t\to\infty} V_r(t) = \infty$ for all $r \ge 1$;

2. $\limsup_{t\to\infty} V_r(t) = \infty$ for all $r \ge 1$, and there exists an $r_0 \ge 1$ such that $\liminf_{t\to\infty} V_r(t) = \infty$ for
all $1 \le r \le r_0$, and $\liminf_{t\to\infty} V_r(t) < \infty$ for all $r > r_0$;

3. $\limsup_{t\to\infty} V_r(t) = \infty$ and $\liminf_{t\to\infty} V_r(t) < \infty$ for all $r \ge 1$;

4. $\sup_t V_r(t) < \infty$ for all $r \ge 1$.

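Proof. Replacing $V_r$ with $\Phi_r$ for the argument, the formula (4.1) yields

$$\Phi_r(t) = \sum_j e^{-t p_j} \frac{(t p_j)^r}{r!}; \qquad \Phi_s(t/2) = \sum_j e^{-t p_j/2} \frac{(t p_j/2)^s}{s!}.$$

For $s < r$, the ratio of the individual terms is given by

$$\frac{e^{-t p_j/2} (t p_j/2)^s / s!}{e^{-t p_j} (t p_j)^r / r!} \ge \min_{y > 0} \{e^{y/2} y^{s-r}\}\, \frac{r!}{s!\, 2^s} = \Big(\frac{e}{2(r-s)}\Big)^{r-s} \frac{r!}{s!\, 2^s}.$$

Hence, for all $s < r$,

$$\Phi_s(t/2) \ge \Phi_r(t) \Big(\frac{e}{2(r-s)}\Big)^{r-s} \frac{r!}{s!\, 2^s}. \qquad (4.4)$$

It now follows that if, for some $r$, $\lim_{t\to\infty} V_r(t) = \infty$, then $\lim_{t\to\infty} V_s(t) = \infty$ for all $1 \le s \le r$ also; and
that, if $\sup_t V_r(t) < \infty$ for some $r$, then $\sup_t V_s(t) < \infty$ for all $s \ge r$. Hence, to complete the proof, we
just need to show that, if $\sup_t V_r(t) < \infty$ for some $r > 1$, then $\sup_t V_1(t) < \infty$.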
For this last part, write $\Phi_r(t) = L_r(t) + R_r(t)$, where

$$L_r(t) := \sum_{j:\, t p_j \ge 1} e^{-t p_j} (t p_j)^r / r!; \qquad R_r(t) := \sum_{j:\, t p_j < 1} e^{-t p_j} (t p_j)^r / r!. \qquad (4.5)$$

Suppose that $\sup_t \Phi_r(t) = K < \infty$. Then, for every $t > 0$,

$$L_1(t) \le r!\, L_r(t) \le r!\, \Phi_r(t) \le K\, r!. \qquad (4.6)$$

It thus remains to bound $R_1(t)$, which in turn can be reduced to finding a bound for

$$S(t) := \sum_{j:\, t p_j < 1} t p_j.$$

Let $a_0 \ge a_1 \ge \ldots > 0$ be any decreasing sequence such that $a_j/a_{j+h} \ge 2$ holds for some $h \ge 1$ and
all $j \ge 0$. Then $a_{ih+m} \le a_m 2^{-i}$ for every $i \ge 0$ and $0 \le m < h$. Splitting the $a_j$'s into $h$ subsequences
that are dominated by the geometric series, we thus have

$$\sum_{j \ge 0} a_j \le \sum_{m=0}^{h-1} 2 a_m \le 2 a_0 h.$$

Now if, for some $h \ge 1$, the frequencies $p_j$ satisfy

$$p_j/p_{j+h} \ge 2 \quad \text{for all } j \ge 1, \qquad (4.7)$$

then applying the above result to the sequence $a_j = t\, p_{j + \min\{i:\, t p_i < 1\}}$ for any $t$ yields the bound $R_1(t) \le
S(t) \le 2h$, since $a_0 < 1$.

On the other hand, if $p_j/p_{j+h} < 2$ for some $j$ and $h$, then it follows from $p_j \ge p_{j+1} \ge \ldots \ge p_{j+h} > p_j/2$
that

$$L_1(2/p_j) \ge \sum_{i=j}^{j+h} e^{-2 p_i/p_j} (2 p_i/p_j) \ge e^{-2}(h + 1).$$

Thus, for any $h$ such that $e^{-2}(h + 1) > K r!$, we see that (4.7) must hold, since otherwise (4.6) would be
violated for $t = 2/p_j$. Hence it follows that $R_1(t) \le S(t) \le 2 e^{2} K r!$, and the final part of the theorem is
proved. □

In particular, in Theorem 3.1, the quantity $\min_{1 \le i \le m} \sqrt{V_{r_i}(t)}$ can thus be replaced in the error estimates
by $\sqrt{\Phi_{r_m}(2t)}$.

We now turn to finding conditions sufficient for distinguishing the asymptotic behaviour of the $V_r(t)$.
To do so, introduce the measures

$$\nu_r(dx) = \sum_{j=1}^{\infty} p_j^r\, \delta_{p_j}(dx).$$

Two special cases are $\nu_0$, a counting measure, and $\nu_1$, the probability distribution of a size-biased pick
from the $p_j$'s. For $r > 0$ write (4.1) as

$$\Phi_r(t) = \frac{t^r}{r!} \int_0^1 e^{-tx} x^r \nu_0(dx) = \frac{t^r}{r!} \int_0^1 e^{-tx} \nu_r(dx) = \frac{t^{r+1}}{r!} \int_0^{\infty} e^{-tx} \nu_r[0, x]\, dx. \qquad (4.8)$$

Comparing with standard gamma integrals, it is then immediate that

$$\liminf_{x \to 0} \frac{\nu_r[0, x]}{x^r} \le \liminf_{t \to \infty} \Phi_r(t) \le \limsup_{t \to \infty} \Phi_r(t) \le \limsup_{x \to 0} \frac{\nu_r[0, x]}{x^r}. \qquad (4.9)$$

This, together with Theorem 4.1, enables us to conclude the following conditions for the convergence to
infinity of $\Phi_r(t)$, and hence equivalently of $V_r(t)$, expressed in terms of the accessible quantities

$$\rho_{j,r} := \frac{1}{p_j^r} \sum_{i=j+1}^{\infty} p_i^r.$$
Lemma 4.2

(a) $\sup_{t \ge 0} \Phi_s(t) < \infty$ for all $s \ge 1$ if and only if for some (and then for all) $r \ge 1$, $\sup_j \rho_{j,r} < \infty$.

(b) If, for some $r \ge 1$, $\lim_{j\to\infty} \rho_{j,r} = \infty$, then $\lim_{t\to\infty} \Phi_s(t) = \infty$ for all $1 \le s \le r$.

Proof. If $p_{j+1} \le x < p_j$ then

$$\rho_{j,r} = \frac{\nu_r[0, p_{j+1}]}{p_j^r} = \frac{\nu_r[0, x]}{p_j^r} \le \frac{\nu_r[0, x]}{x^r} \le \frac{\nu_r[0, p_{j+1}]}{p_{j+1}^r} = 1 + \rho_{j+1,r}.$$

Hence (4.9) can be replaced by the inequalities

$$\liminf_{j \to \infty} \rho_{j,r} \le \liminf_{t \to \infty} \Phi_r(t) \le \limsup_{t \to \infty} \Phi_r(t) \le 1 + \limsup_{j \to \infty} \rho_{j,r}. \qquad (4.10)$$

Part (b) of the lemma now follows directly from Theorem 4.1.

For part (a), much as for the last part of the proof of Theorem 4.1, define

$$h(j) := \max\{l \ge 0 : p_{j+l}/p_j \ge 1/2\}; \qquad h^* := \sup_j h(j).$$

Then it is immediate that

$$2^{-r} h(j) \le \rho_{j,r} \le h^* \sum_{i \ge 1} 2^{-(i-1)} = 2h^*,$$

so that $h^* < \infty$ if and only if $\sup_j \rho_{j,r} < \infty$ for some, and then for all, $r \ge 1$. We now conclude the
proof by showing that $\sup_{t \ge 0} \Phi_s(t) < \infty$ for all $s \ge 1$ if and only if $h^* < \infty$. Defining $L_r(t)$ and $R_r(t)$ as
in (4.5), we observe that, if $h^* < \infty$, then

$$R_r(t) \le h^* \sum_{i \ge 1} 2^{-r(i-1)} \le 2h^* \quad \text{and} \quad L_r(t) \le h^* \sum_{i \ge 1} e^{-2^{i-1}} \frac{2^{ir}}{r!},$$

so that $\Phi_r(t) = L_r(t) + R_r(t) < \infty$, uniformly in $t$, for all $r \ge 1$. On the other hand,

$$L_r(1/p_{j+h(j)}) \ge e^{-2} h(j)/r!,$$

implying that, if $h^* = \infty$, then $\limsup_{t\to\infty} \Phi_r(t) = \infty$ for all $r \ge 1$. □
The familiar ratio test yields simpler sufficient conditions. Thus $\sup_t \Phi_r(t) < \infty$ for all $r \ge 1$ if

$$\limsup_{j \to \infty} p_{j+1}/p_j < 1,$$

while $\lim_{t\to\infty} \Phi_r(t) = \infty$ for all $r \ge 1$ if

$$\lim_{j \to \infty} p_{j+1}/p_j = 1.$$

For instance, for $p_j = c q^j$, the geometric distribution with $0 < q < 1$, we have $p_{j+1}/p_j = q$; hence
$\sup_t \Phi_r(t) < \infty$ for all $r$, and normal approximation is not adequate for any $r$. This illustrates possibility 4
in Theorem 4.1. For the Poisson distribution $p_j = c \lambda^j / j!$, we even have $p_{j+1}/p_j \to 0$, and so normal
approximation is no good here, either.
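
The contrast between the two ratio-test regimes is easy to see numerically. The following sketch compares $\Phi_1(t)$ for geometric frequencies $p_j = 2^{-j}$ with $\Phi_1(t)$ for the power-law frequencies $p_j = 6/(\pi^2 j^2)$. Both examples, the truncation levels and the function name are assumptions made for this illustration; the truncated tails are negligible for the values of $t$ used.

```python
import math

def phi(freqs, r: int, t: float) -> float:
    """Phi_r(t) = sum_j exp(-t p_j) (t p_j)^r / r! over the given (truncated) frequencies."""
    return sum(math.exp(-t * p) * (t * p) ** r / math.factorial(r) for p in freqs)

# Geometric frequencies p_j = (1/2)^j: the ratio p_{j+1}/p_j equals 1/2 < 1.
geometric = [0.5 ** j for j in range(1, 201)]

# Power-law frequencies p_j = 6 / (pi^2 j^2): the ratio p_{j+1}/p_j tends to 1.
power_law = [6.0 / (math.pi ** 2 * j * j) for j in range(1, 200001)]

for t in (1e2, 1e3, 1e4):
    print(t, round(phi(geometric, 1, t), 3), round(phi(power_law, 1, t), 3))
```

In the geometric case the values stay near a constant, while in the power-law case they grow roughly like $\sqrt{t}$.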

Continuing this line, we obtain a further set of conditions. 



Lemma 4.3 (a) Suppose for some $0 < \lambda < 1$

$$\liminf_{j \to \infty} \frac{p_{j+h}}{p_j} > \lambda \qquad (4.11)$$

for every $h \ge 1$. Then $\Phi_r(t) \to \infty$ as $t \to \infty$ for all $r \ge 1$.
(b) The condition $\limsup_{t\to\infty} \Phi_r(t) < \infty$ holds for some (hence for all) $r \ge 1$ if and only if there exists
$h \ge 1$ such that

$$\limsup_{j \to \infty} \frac{p_{j+h}}{p_j} < \frac{1}{2}. \qquad (4.12)$$

Proof. For part (a), assume that $\nu_0(\lambda x, x) = \#\{j : \lambda x < p_j < x\} \to \infty$ as $x \to 0$. Then also

$$\Phi_r(1/x) \ge \sum_{\{j:\, \lambda x < p_j < x\}} e^{-p_j/x} (p_j/x)^r / r! \ge \nu_0(\lambda x, x) \min_{\{y:\, \lambda \le y \le 1\}} [e^{-y} y^r / r!] \to \infty.$$

As $x$ decreases, the piecewise-constant function $\nu_0(\lambda x, x)$ may have downward jumps only at the values
$x \in \{p_j\}$, hence the assumption is equivalent to $\nu_0(\lambda p_j, p_j) \to \infty$ (as $j \to \infty$), which in turn is readily
translated into (4.11).

For part (b), the same estimate with any $0 < \lambda < 1/2$ shows that the condition (4.12) is necessary.
In the other direction, suppose that $p_{j+h}/p_j < 3/4$ for all $j \ge J$. Split $(p_j,\ j \ge J)$ into $h$ subsequences
$(p_{J+s+ih},\ i \ge 0)$, with $0 \le s \le h - 1$. Each of the subsequences has the property that the ratio of any
two consecutive elements is at most 3/4. Hence, as above, the sum of the terms $e^{-t p_j} (t p_j)^r / r!$ along a
subsequence yields a uniformly bounded contribution to $\Phi_r(t)$. □

Examples of irregular behaviour of moments may be constructed by breaking the sequence $(p_j,\ j \ge 1)$
into finite blocks of sizes $m_1, m_2, \ldots$, and setting the $p_j$'s within the $i$'th block all equal to some $q_i$. We
use the notation $V(t) := \mathrm{Var}\big(\sum_{r \ge 1} Y_r(t)\big)$ to denote the variance of the number of occupied boxes.

Example 1. [5, p. 384]. Take $m_i = i$ and q.^ = c2~^', with $c$ a normalizing factor¹ to achieve $\sum_j p_j = 1$.
Then both $V(t)$ and $\Phi_1(t)$ oscillate between 0 and $\infty$, approaching the extremes arbitrarily closely. This
illustrates possibility 3 in Theorem 4.1.

Example 2. As in [3, Example 4.4], take = 2^', = c2~^'^\. Then $\Phi_1(t) \to \infty$, but $\Phi_2(t)$ oscillates
between 0 and $\infty$ as $t$ varies; thus $Y_1(t)$ is asymptotically normal, but $Y_2(t)$ is not, and the ratios $p_{j+1}/p_j$
have accumulation points at 0 and 1. This illustrates possibility 2 in Theorem 4.1.



We now extend this example, showing among other things that one can have any value for $r_0$ in behaviour 2
in Theorem 4.1.



Proposition 4.4 Fix $0 < \beta < 1$ and $\alpha > 0$, and take the blocks construction with $m_i = \lfloor 2^{(1-\beta)^{-i}} \rfloor$,
$q_i = c\, m_i^{-(1+\alpha)}$, where $c$ is the appropriate normalizing constant. Then we have

(i) $\limsup_{t\to\infty} V_r(t) = \infty$ for all $r \ge 1$;

(ii) $\lim_{t\to\infty} V_r(t) = \infty$ if and only if $r\beta(1 + \alpha) \le 1$;

(iii) $\lim_{j\to\infty} \rho_{j,r} = \infty$ if and only if $r\beta(1 + \alpha) < 1$;

(iv) The quantities $\Sigma_{rs}(t)$ do not converge for any $r \ne s$.
Proof. Once again, we work with $\Phi_r$ instead of $V_r$, now writing

$$\Phi_r(t) = \sum_{i \ge 1} m_i e^{-t q_i} (t q_i)^r / r!. \qquad (4.13)$$

For part (i), it is enough to consider the subsequence $t_l := 1/q_l$, $l \ge 1$.

For part (ii), split $\mathbb{R}_+$ into intervals $J_l := [q_l^{-1}, q_{l+1}^{-1})$, $l \ge 1$; we show that $\lim_{l\to\infty} \inf_{t \in J_l} \Phi_r(t) = \infty$
if $r\beta(1 + \alpha) \le 1$, and exhibit a subsequence $(t'_l,\ l \ge 1)$ with $t'_l \in J_l$ such that $\lim_{l\to\infty} \Phi_r(t'_l) = 0$ if
$r\beta(1 + \alpha) > 1$. Indeed, for $t \in J_l$, taking just the term with $i = l + 1$ in (4.13), we obtain

$$m_{l+1} \exp\{-\phi q_{l+1}/q_l\} (\phi q_{l+1}/q_l)^r / r! \asymp m_{l+1} \phi^r \Big(\frac{m_{l+1}^{(1-\beta)(1+\alpha)}}{m_{l+1}^{1+\alpha}}\Big)^r = \phi^r m_{l+1}^{1 - r\beta(1+\alpha)},$$


¹In fact, the Poisson sampling model makes sense for arbitrary $p_j$'s, and the enumeration of small counts makes sense if
$\sum_j p_j < \infty$.
where we write $t = \phi/q_l$ with $1 \le \phi < q_l/q_{l+1} \asymp m_{l+1}^{\beta(1+\alpha)}$, and use the fact that $\phi q_{l+1}/q_l < 1$ in this

range. For $r\beta(1 + \alpha) < 1$, it follows that $\inf_{t \in J_l} \Phi_r(t)$ is at least of order $m_{l+1}^{1 - r\beta(1+\alpha)} \to \infty$ as $l \to \infty$.

For $r\beta(1 + \alpha) = 1$, take also the term with $i = l$ in (4.13), giving a combined contribution of at least

r\ 

for some > 0. It is easily checked that the minimum value of this sum for $\phi \ge 1$ goes to $\infty$ with $l$,
hence, once again, $\lim_{l\to\infty} \inf_{t \in J_l} \Phi_r(t) = \infty$.

For $r\beta(1 + \alpha) > 1$, these two terms contribute an amount of order

$$\phi^r \{ m_l e^{-\phi} + m_{l+1}^{1 - r\beta(1+\alpha)} \}, \qquad (4.14)$$

to (4.13), which is small as $l \to \infty$, for example, for $\phi = 2 \log m_{l+1}$. The sum of the terms in (4.13) for
$i \ge l + 2$ is of order

i>l+2 \ 1'- / i>l+2 

where $\eta > 0$, and hence asymptotically smaller than the second element of (4.14). The sum of the terms
in (4.13) for $i \le l - 1$ is of order at most

( i-i 



'^m, \ exp{-(l)qi^i/qi} (J-^^ 



largest for $\phi = 1$ for all $l$ large enough, when it is of order

asymptotically small as $l \to \infty$. Hence, for $t'_l = 2 q_l^{-1} \log m_{l+1}$, it follows that $\lim_{l\to\infty} \Phi_r(t'_l) = 0$, and
therefore that $\Phi_r(t)$ does not converge to infinity as $t \to \infty$.

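For part (iii), writing $M_i := \sum_{l=1}^{i} m_l$, we have

$$\rho_{j,r} \ge q_i^{-r} \sum_{l \ge i+1} m_l q_l^r \quad \text{whenever } M_{i-1} < j \le M_i,$$

with equality for $j = M_i$. Now

$$\sum_{l \ge i+1} m_l q_l^r \asymp m_{i+1}^{1 - r(1+\alpha)}$$

and

$$q_i^{-r} \asymp m_i^{r(1+\alpha)} \asymp m_{i+1}^{r(1-\beta)(1+\alpha)}.$$

Hence $\rho_{M_i,r} \asymp m_{i+1}^{1 - r\beta(1+\alpha)}$ is bounded for $r\beta(1 + \alpha) \ge 1$, and $\rho_{j,r} \to \infty$ as $j \to \infty$ if $r\beta(1 + \alpha) < 1$.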
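For part (iv), we note that, for $t = \phi/q_l$, the quantity

$$\Sigma_{rs}(t) = -2^{-(r+s)} \binom{r+s}{r} \frac{\Phi_{r+s}(2t)}{\sqrt{V_r(t) V_s(t)}}$$

behaves asymptotically, as $l$ becomes large, in the same way as for the Poisson occupancy scheme with a
single block of $m_l$ boxes with equal frequencies $q_l$. Computing the limit, with $u_r := e^{-\phi} \phi^r / r!$,

$$\lim_{l \to \infty} \Sigma_{rs}(\phi/q_l) = -\sqrt{\frac{u_r u_s}{(1 - u_r)(1 - u_s)}},$$

where $m_l$ cancels because of the additivity of the moments. As $\phi$ varies, this limit value varies too, and
hence, for $r \ne s$, the quantities $\Sigma_{rs}(t)$ do not converge as $t \to \infty$. □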
It follows from parts (ii) and (iii) of Proposition 4.4 that the implication in part (b) of Lemma 4.2
cannot be reversed, and from part (iv) that the correlations between different components of $Y(t)$ need
not converge, even when their variances tend to infinity. Hence the approximation in Theorem 3.1 does
not necessarily imply convergence. Yet another kind of pathology appears when $Y_1(t)$ is asymptotically
independent of $(Y_r(t),\ r > 1)$, as in the following example.

Example 3. Suppose that the frequencies in the block construction satisfy $q_i = 1/i!$, $m_i = (i - 2)!$ (with
$i \ge 2$). Since $m_{k+1} q_{k+1}^r / q_k^r \to \infty$ for each $r$, we have $\lim_{j\to\infty} \rho_{j,r} = \infty$, and hence all the variances
$V_r(t)$ go to $\infty$ by Lemma 4.2(b). On the other hand, $m_i q_i / \sum_{k=i+1}^{\infty} m_k q_k \to 0$, and it follows that

$$\frac{\Phi_{1+s}(2t)}{\Phi_1(t)} = \frac{2^{s+1}}{(s+1)!}\, \frac{\sum_i m_i q_i e^{-t q_i} \{e^{-t q_i} t^s q_i^s\}}{\sum_i m_i q_i e^{-t q_i}} \to 0$$

as $t \to \infty$. Since $\Phi_{1+s}(2t)/\Phi_s(t)$ is bounded above by (4.4), we conclude that $\Sigma_{1,s}(t) \to 0$ for $s \ge 2$. It
follows that every pair $(Y'_1(t), Y'_s(t))$, $s \ge 2$, converges in distribution to the standard bivariate normal
distribution with independent components. Because the variances go to $\infty$, Theorem 3.1 guarantees
increasing quality of the normal approximation for any finite collection of components $Y'_r(t)$. However,
the full vector $(Y'_r,\ r = 1, 2, \ldots)$ does not converge: see more on this example in Sections 5 and 6.

Part (ii) of Proposition 4.4 also demonstrates that $\liminf_{j\to\infty} p_{j+1}/p_j = 0$ does not exclude that
$\Phi_r(t) \to \infty$, hence the condition (4.11) in Lemma 4.3 is not necessary. Finally, by [3, Eqn. 3.1], we have

$$\tfrac{1}{2} \Phi_1(2t) \le V(t) \le \Phi_1(t),$$

meaning that $\Phi_1(t)$ is always of the same order as the variance of the number of occupied boxes $V(t)$.
The examples above show that this need not be the case for $\Phi_r(t)$, when $r \ge 2$.



5 Regular variation 

We now henceforth assume that $\Phi_r(t) \to \infty$ for all $r \ge 1$. The CLT for each component of $Y(t)$ then holds,
as observed above, and normal approximation becomes progressively more accurate for the joint distribution
of any finite collection of components. A joint normal limit for any collection of the standardized
components also holds, provided that the corresponding covariances converge. From (4.3) we have

$$\mathrm{Cov}(Y'_r(t), Y'_s(t)) = \Sigma_{rs}(t) = c(r, s)\, \frac{\Phi_{r+s}(2t)}{\sqrt{V_r(t) V_s(t)}}, \quad r \ne s. \qquad (5.1)$$

The RHS converges to a nonzero limit for each pair $r, s$ if, for each $r$, $\Phi_r \approx f \in R_\alpha$, where $R_\alpha$ denotes
the class of functions regularly varying at $\infty$ with index $\alpha$, and where, here and subsequently, we write
$a \approx b$ if $a(t)/b(t) \to c$ as $t \to \infty$ with $0 < c < \infty$. If $\Phi_r \in R_\alpha$, then the index belongs to the range
$0 \le \alpha \le 1$, because $\Phi_r(t)$ cannot converge to 0, and because $\Phi_r(t)/t \to 0$.

The results in the next section show that, if the covariances converge for a sufficiently large set of
pairs $r, s$, then this is in fact the only possibility. More formally, we say that then regular variation holds
in the occupancy problem, meaning that, for some $0 \le \alpha \le 1$ and some rate function $f \in R_\alpha$,

$$\Phi_r \approx f \quad \text{for all } r \ge 2. \qquad (5.2)$$

This setting of regular variation extends the original approach by Karlin [8] in the special case $\alpha = 0$,
and, moreover, it covers all possible limiting covariance structures (Theorem 6.4).
Observe that the functions $t^{-r} \Phi_r$ satisfy

$$\frac{d^{r-1}}{dt^{r-1}} \{t^{-1} \Phi_1(t)\} = (-1)^{r-1} r!\, \{t^{-r} \Phi_r(t)\}, \qquad (5.3)$$

thus, in particular, they are completely monotone. This taken together with the standard properties of
regularly varying functions [2] implies that, if $\Phi_r \in R_\alpha$ for some $0 \le \alpha < 1$ and $r \ge 1$, then the same is
true for all $r \ge 1$, and we can choose the rate function $f = \Phi_1$. The case $\alpha = 1$ is special. If $\Phi_r \in R_1$
for some $r \ge 2$, then all $\Phi_r$ for $r \ge 2$ are of the same order of growth and $\Phi_1 \in R_1$ but $\Phi_1 \gg \Phi_2$ (this
motivates the choice $r \ge 2$ in (5.2)).
A necessary condition for (5.2) is $\lim_{j\to\infty} p_{j+1}/p_j = 1$, as follows from the next lemma.

Lemma 5.1 If $\liminf_{j\to\infty} p_{j+1}/p_j < 1$ then $\Phi_r$ is not regularly varying for $r \ge 2$, and $\Phi_1$ is not
regularly varying with index $\alpha < 1$.

Proof. We have

$$2 t^{-2} \Phi_2(t) = \sum_j e^{-t p_j} p_j^2 = \int_0^1 e^{-tx} \nu_2(dx)$$

with $\nu_2[0, x] := \sum_{j=1}^{\infty} p_j^2\, 1[p_j \le x]$. Suppose $t^{-2} \Phi_2 \in R_{-\beta}$, then $1 \le \beta \le 2$ and, by Karamata's Tauberian
theorem, also $\nu_2[0, t^{-1}] \in R_{-\beta}$. Because $\beta \ne 0$, the latter implies that $\nu_2[a t^{-1}, b t^{-1}] \in R_{-\beta}$, i.e. that

$$\nu_2[a t^{-1}, b t^{-1}] \sim (b^\beta - a^\beta) \ell(t) t^{-\beta}, \quad t \to \infty \qquad (5.4)$$

for any positive $a < b$. However, the assumption of the lemma allows us to choose $a < b < 1$ such that
$\nu_2[a p_j, b p_j] = 0$ for infinitely many $j = j_k$, so (5.4) fails for $t = 1/p_{j_k} \to \infty$. The contradiction shows that
$t^{-2} \Phi_2(t)$ cannot be regularly varying. The assertions regarding $r \ne 2$ can be derived in the same way. □

The example below shows that $\Phi_r$ may be regularly varying for $r = 1$ alone.

Example 3 (continued). Let $g(t) := \nu_1[0, t^{-1}] = \sum_{j=1}^{\infty} p_j\, 1[p_j \le t^{-1}]$. We have the general estimates

$$t^{-1} \Phi_1(t) \ge e^{-1} g(t)$$

and, for $a > 1$ and any $\varepsilon > 0$,

$$t^{-1} \Phi_1(t) - (at)^{-1} \Phi_1(at) \le \varepsilon g(at/\varepsilon) + \{g(t/\log\{1/\varepsilon g(t)\}) - g(at/\varepsilon)\} + \sum_{j=1}^{\infty} p_j e^{-t p_j}\, 1[p_j > t^{-1} \log\{1/\varepsilon g(t)\}]$$

$$\le 2\varepsilon g(t) + \{g(t/\log\{1/\varepsilon g(t)\}) - g(at/\varepsilon)\}.$$

Applying these to the block construction with $q_i = 1/i!$ and $m_i = (i - 2)!$, we observe that $g(t) \asymp I(t)^{-1}$
and that $g(t/\log\{1/\varepsilon g(t)\}) - g(at/\varepsilon)$ involves at most two $q_i$, each of the corresponding terms being of
the order of $I(t)^{-2}$, where $I(t) := \min\{i : i! > t\}$. It follows that $t^{-1} \Phi_1(t) \in R_0$, whence $\Phi_1 \in R_1$ and
$\Phi_1 \gg \Phi_r$ for $r \ge 2$. However, $q_{i+1}/q_i \to 0$, therefore Lemma 5.1 implies that $\Phi_r \notin R_1$ for $r \ge 2$.

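The proper case of regular variation with index $0 < \alpha < 1$ can be characterized by Karlin's condition
[8, Equation 5]

$$\nu_0[x, 1] := \#\{j : p_j \ge x\} \sim \ell(1/x) x^{-\alpha}, \quad x \downarrow 0, \qquad (5.5)$$

where and henceforth the symbol $\ell$ stands for a function of slow variation at $\infty$. Other equivalent
conditions are (see [B])

$$\Phi(t) := \int_0^1 (1 - e^{-tx}) \nu_0(dx) \sim \Gamma(1 - \alpha) t^\alpha \ell(t),$$

$$\nu_r[0, x] \sim \frac{\alpha}{r - \alpha}\, x^{r-\alpha} \ell(1/x) \quad \text{for some } r \ge 1,$$

$$\Phi_r(t) \sim \frac{\alpha \Gamma(r - \alpha)}{r!}\, t^\alpha \ell(t) \quad \text{for some } r \ge 1,$$

$$p_j \sim \ell^*(j)\, j^{-1/\alpha},$$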
where $\ell^*(y) = 1/\{\ell^{1/\alpha}(y^{1/\alpha})\}^{\#}$, and $\#$ denotes the de Bruijn conjugate of a slowly varying function [2].
Note that $V_r(t)$ then has the same order of growth, in view of (4.2), yielding behaviour as in possibility 1
of Theorem 4.1. The joint CLT for

$$\big( (Y_r(t) - \Phi_r(t)) / (t^{\alpha/2} \ell^{1/2}(t)),\ r \ge 1 \big)$$

in $\mathbb{R}^\infty$ holds with the limiting covariance matrix $S$ computable from (4.3) as

$$S_{rs} = -\frac{\alpha \Gamma(r + s - \alpha)}{r!\, s!\, 2^{r+s-\alpha}}, \quad r \ne s,$$

$$S_{rr} = \frac{\alpha}{r!} \Big( \Gamma(r - \alpha) - \frac{\Gamma(2r - \alpha)}{r!\, 2^{2r-\alpha}} \Big),$$

in accord with Karlin [5, Theorem 5].
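
For concreteness, the entries of the limiting covariance matrix can be tabulated directly from the two formulas above, $S_{rs} = -\alpha\Gamma(r+s-\alpha)/(r!\,s!\,2^{r+s-\alpha})$ for $r \ne s$ and $S_{rr} = (\alpha/r!)\{\Gamma(r-\alpha) - \Gamma(2r-\alpha)/(r!\,2^{2r-\alpha})\}$. The sketch below is an illustration only; the function names and the restriction to $0 < \alpha < 1$ are assumptions of this example.

```python
import math

def limit_cov(alpha: float, r: int, s: int) -> float:
    """Entry S_rs of the limiting covariance matrix, for 0 < alpha < 1 and r, s >= 1."""
    if r != s:
        return -alpha * math.gamma(r + s - alpha) / (
            math.factorial(r) * math.factorial(s) * 2.0 ** (r + s - alpha)
        )
    return alpha / math.factorial(r) * (
        math.gamma(r - alpha)
        - math.gamma(2 * r - alpha) / (math.factorial(r) * 2.0 ** (2 * r - alpha))
    )

def limit_cov_matrix(alpha: float, size: int):
    """The top-left size x size corner of S, indexed by r, s = 1, ..., size."""
    return [[limit_cov(alpha, r, s) for s in range(1, size + 1)]
            for r in range(1, size + 1)]

for row in limit_cov_matrix(0.5, 3):
    print([round(x, 4) for x in row])
```

All off-diagonal entries are negative, reflecting the fact that a box cannot contain exactly $r$ and exactly $s$ balls at the same time.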

If (5.5) holds with $\alpha = 1$ then $\ell(t)$ must approach 0 as $t \to \infty$ sufficiently fast to have $\sum_j p_j < \infty$.
In this situation we have $\Phi_r(t) \sim (r^2 - r)^{-1} \ell(t) t$ for $r > 1$ but $\Phi_1(t) \sim \ell_1(t) t$ with some $\ell_1 \gg \ell$. In fact,
$X_{n,1} \sim K_n$ as $n \to \infty$ almost surely. Because the scaling of $Y_1(t)$ is faster than that for other $Y_r(t)$'s, it
follows from ((i?^ and that $\Sigma_{1r}(t) \to 0$ for all $r \ge 2$, so that the CLT holds with $Y'_1(t)$ asymptotically
independent of $(Y'_r(t),\ r \ge 2)$. The limiting covariance matrix of $\{(Y_r(t) - \Phi_r(t))/\sqrt{t \ell(t)},\ r \ge 2\}$ is obtained

by setting $\alpha = 1$ in the above formulas for $S$. Our multivariate result extends in this case the marginal
convergence that was stated in [H, Thm. 5'].

Karlin's condition (5.5) with $\alpha = 0$ is too weak to control the $\Phi_r(t)$'s. However, a slightly stronger
condition

$$\nu_1[0, x] := \sum_{j:\, p_j \le x} p_j \sim x\, \ell_1(1/x), \quad x \downarrow 0, \qquad (5.6)$$

is equivalent to $\Phi_r \in R_0$ for any (and hence for all) $r \ge 1$. To illustrate the difference, note that in
the geometric case, with $p_j = (1 - q) q^{j-1}$, $0 < q < 1$, we have $\ell(1/x) \sim \log_{1/q}(1/x)$, whereas $\nu_1[0, x] =
q^{\lceil \log_q (x/(1-q)) \rceil}$ is not regularly varying, since $\nu_1[0, x]/x$ jumps infinitely often from $(1 - q)^{-1}$ to $q(1 - q)^{-1}$
as $x \to 0$. The geometric case can be contrasted to the one with frequencies $p_j = c e^{-j^\beta}$ ($0 < \beta < 1$), for
which $\ell(1/x)$ is of order $|\log x|^{1/\beta}$ and $\nu_1[0, x]/x$ is of order $|\log x|^{1/\beta - 1}$.
By [2, Prop. 15], the general connection between $\ell_1$ in (5.6) and $\ell$ in (5.5) is

$$\ell(1/x) \sim \int_x^1 u^{-1} \ell_1(1/u)\, du, \quad 0 < x < 1.$$

Adopting (5.6) we have $\nu_r[0, x] \sim r^{-1} x^r \ell_1(1/x)$, $r \ge 1$, and the situation is then very similar to that in
the proper case: we have $\Phi_r(t) \sim r^{-1} \ell_1(t)$ and $\{(Y_r(t) - \Phi_r(t))/\sqrt{\ell_1(t)},\ r \ge 1\}$ converges in law to a
multivariate Gaussian limit with covariance matrix $S$ given by

$$S_{rr} = \frac{1}{r} - \frac{1}{r\, 2^{2r+1}} \binom{2r}{r}, \qquad S_{rs} = -\frac{1}{(r+s)\, 2^{r+s}} \binom{r+s}{r}, \quad r \ne s.$$

This applies, for instance, to the frequencies $p_j = c e^{-j^\beta}$ ($0 < \beta < 1$). This case of slow variation seems
not to have been considered before.



6 Convergence of the covariances 

We will show in this section that regular variation is essential for the multivariate convergence of the whole
standardized vector of counts, so that all possible limit covariance structures are those characterized in
the previous section. Our starting point is the following lemma, which asserts that the regular variation
is forced by the convergence of the ratios of $\Phi_r$'s.

^ One example is pj = c/j{log(jr + 1)}''+'^, /3 > 0, in which case £{t) ~ l/c(log . 

^ Mikhailov [12] indicated yet another situation where the $X_{n,r}$'s for $r > 1$ all behave similarly, but their behaviour is
distinct from that of $X_{n,1}$.
Lemma 6.1 Suppose for some $r \ge 1$

$$\lim_{t \to \infty} \Phi_{r+1}(t)/\Phi_r(t) = c. \qquad (6.1)$$

Then $(r - 1)/(r + 1) \le c \le r/(r + 1)$ and $\Phi_r \in R_\alpha$ with $\alpha := r - c(r + 1)$. Moreover, we then always have

$$\lim_{t \to \infty} \frac{\Phi_{s+1}(t)}{\Phi_s(t)} = \frac{s - \alpha}{s + 1} \qquad (6.2)$$

and $\Phi_s \in R_\alpha$ for all $s \ge 1$, unless $\alpha = 1$. If (6.1) holds with $r > 1$ and $c = (r - 1)/(r + 1)$, then $\Phi_s \in R_1$
for $s \ge 2$, and (6.2) is still true (in particular, $\Phi_1 \gg \Phi_2$).

Proof. A monotone density result which dates back to von Mises and Lamperti [2] says that the convergence
$t g'(t)/g(t) \to \beta$ implies $g \in R_\beta$ (this holds for arbitrary $\beta$, including $\pm\infty$). This result applied
to $g(t) = t^{-r} \Phi_r(t)$ yields the regular variation $\Phi_r \in R_\alpha$, with some $0 \le \alpha \le 1$. The rest follows from
(5.3), monotonicity and the general behaviour of the regularly varying functions under integration and
differentiation [2]. □
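
The limit (6.2) can be observed numerically in a regularly varying example. The sketch below uses the frequencies $p_j = 6/(\pi^2 j^2)$, for which $\#\{j : p_j \ge x\}$ grows like $x^{-1/2}$, so that $\alpha = 1/2$; the example, the truncation at 200000 boxes and the function names are assumptions of this illustration.

```python
import math

ALPHA = 0.5  # index of regular variation for p_j proportional to j^{-2}

# Truncated frequencies p_j = 6 / (pi^2 j^2); the neglected tail is negligible for t = 1000.
FREQS = [6.0 / (math.pi ** 2 * j * j) for j in range(1, 200001)]

def phi(r: int, t: float) -> float:
    """Phi_r(t) = sum_j exp(-t p_j) (t p_j)^r / r!."""
    return sum(math.exp(-t * p) * (t * p) ** r / math.factorial(r) for p in FREQS)

def ratio(s: int, t: float) -> float:
    """Phi_{s+1}(t) / Phi_s(t), to be compared with (s - alpha) / (s + 1)."""
    return phi(s + 1, t) / phi(s, t)

for s in (1, 2, 3):
    print(s, ratio(s, 1000.0), (s - ALPHA) / (s + 1))
```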

To apply the lemma, we need to pass from the convergence of covariances (5.1) to the convergence of
a ratio as in (6.1). To this end, it is useful to exclude zero limits.



Lemma 6.2 // limsup^ $s(t) = 00 for any s > 1, then no correlation 'E,r,r'{t) with 2 < r < r' can 
converge to zero. 

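Proof. (i) Let $m_j := \#\{l : 2^{-(j+1)} < p_l \le 2^{-j}\}$. Then, if $m^* := \sup_j m_j < \infty$, it follows that, for 
$2^j \le t \le 2^{j+1}$, 

$$\begin{aligned}
s!\,\Phi_s(t) &= \sum_{k\ge 0}\ \sum_{\{l:\,2^{-(k+1)} < p_l \le 2^{-k}\}} (tp_l)^s e^{-tp_l} \\
&\le \sum_{k\ge 0} m_k\, 2^{(j+1-k)s} \exp\{-2^{j-k-1}\} \\
&\le m^*\Big( \sum_{k\ge j+1} 2^{(j+1-k)s} + 2^s \sum_{k=0}^{j} 2^{s(j-k)} \exp\{-2^{j-k-1}\} \Big) \\
&\le m^*\Big( 2 + 2^s \sum_{l\ge 0} 2^{sl} \exp\{-2^{l-1}\} \Big) = m^* C_s < \infty,
\end{aligned}$$

uniformly in $j$, which contradicts $\limsup_{t\to\infty} \Phi_s(t) = \infty$. Hence $\sup_j m_j = \infty$. 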

(ii) Given any $j_0$, there exists some $j \ge j_0$ such that 

$$m_k \le m_j,\quad 0 \le k \le j; \qquad m_k \le 3^{k-j} m_j,\quad k > j. \qquad (6.3)$$

To see this, first take $j_1 \ge j_0$ such that $m_{j_1} = \max\{m_k,\ 0 \le k \le j_1\}$, as can always be done, since 
$\sup_j m_j = \infty$. Then let $j_2 := \max\{k \ge j_1 : m_k \ge 3^{k-j_1} m_{j_1}\}$; this is finite, since $1 \ge \sum_{l\ge 1} p_l \ge m_j 2^{-(j+1)}$ 
for each $j \ge 0$. Finally, take $j_3 := \arg\max_{j_1 \le j \le j_2} m_j$; then $j_3$ satisfies the requirements of (6.3). 

(iii) Now suppose that $j$ satisfies (6.3). Then, much as in part (i), for any $r \ge 2$, 

$$\begin{aligned}
r!\,\Phi_r(2^j) &\le \sum_{k\ge 0} m_k\, 2^{(j-k)r} \exp\{-2^{j-k-1}\} \\
&\le m_j\Big( \sum_{k\ge j+1} 3^{k-j}\, 2^{r(j-k)} + \sum_{k=0}^{j} 2^{r(j-k)} \exp\{-2^{j-k-1}\} \Big) \\
&\le m_j\Big( 3 + \sum_{l\ge 0} 2^{rl} \exp\{-2^{l-1}\} \Big) = C'_r\, m_j,
\end{aligned}$$

with $C'_r < \infty$, whereas also, just from the indices $l$ with $2^{-(j+1)} < p_l \le 2^{-j}$, we have 

$$(r+r')!\,\Phi_{r+r'}(2^{j+1}) \ge m_j e^{-2}.$$

This implies that 

$$\Phi_{r+r'}(2t)\big/\sqrt{\Phi_r(t)\Phi_{r'}(t)} \;\ge\; \frac{e^{-2}\sqrt{r!\,r'!}}{(r+r')!\,\sqrt{C'_r C'_{r'}}} \;>\; 0$$

for $t = 2^j$, whenever $j$ satisfies the requirements of (6.3), and there are infinitely many such. Hence the 
correlations $\Sigma_{r,r'}(t)$ with $r' > r \ge 2$ cannot converge to zero. □ 

Note that the correlations $\Sigma_{1,s}(t)$, $s > 1$, converge to zero in the case of regular variation with index 
$\alpha = 1$. Example 3 illustrates that $\Sigma_{1,s}(t)$ may also converge to zero when regular variation in the sense 
of (5.2) does not hold. 

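Lemma 6.3 If $g$ is continuous and positive, and $g(2t)/\sqrt{g(t)} \to k$ as $t \to \infty$, with $0 < k < \infty$, then 
$g(t) \to k^2$. 

Proof. Given $\varepsilon > 0$, let $t_\varepsilon$ be such that $g(2t) \le k(1+\varepsilon)\sqrt{g(t)}$ for all $t \ge t_\varepsilon$. Let $K_\varepsilon := \sup_{t\in J_\varepsilon} g(t)$, 
where $J_\varepsilon := [t_\varepsilon, 2t_\varepsilon]$. Then, for all $t \in J_\varepsilon$ and all $n \ge 0$, we have 

$$g(2^n t) \le \{k^2(1+\varepsilon)^2\}^{1-2^{-n}}\,\{g(t)\}^{2^{-n}} \le \{k^2(1+\varepsilon)^2\}^{1-2^{-n}}\,K_\varepsilon^{2^{-n}}.$$

The right-hand side tends to $k^2(1+\varepsilon)^2$ as $n \to \infty$, and $\varepsilon$ is arbitrary. Thus $\limsup_{t\to\infty} g(t) \le k^2$. A similar argument shows that $\liminf_{t\to\infty} g(t) \ge k^2$, proving the lemma. □ 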
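
The mechanism behind Lemma 6.3 can be illustrated numerically in the equality case $g(2t) = k\sqrt{g(t)}$, where doubling $t$ amounts to applying the map $x \mapsto k\sqrt{x}$. The following is a minimal sketch, not part of the original argument; the function names `iterate_sqrt_map` and `closed_form` are hypothetical and chosen only for this illustration.

```python
import math


def iterate_sqrt_map(x0, k, steps=60):
    """Apply x -> k * sqrt(x) repeatedly, starting from x0 > 0.

    This mimics the equality case g(2t) = k * sqrt(g(t)) of Lemma 6.3:
    each step corresponds to doubling the argument t.
    """
    x = x0
    for _ in range(steps):
        x = k * math.sqrt(x)
    return x


def closed_form(x0, k, n):
    """n-fold iterate in closed form: k^(2 - 2^(1-n)) * x0^(2^(-n))."""
    return k ** (2.0 - 2.0 ** (1 - n)) * x0 ** (2.0 ** (-n))


if __name__ == "__main__":
    # Very different starting values are all attracted to k^2.
    for start in (1e-6, 1.0, 1e6):
        print(start, iterate_sqrt_map(start, k=3.0))
```

The closed form shows why the starting value is forgotten: its exponent $2^{-n}$ tends to zero, while the exponent of $k$ tends to 2. The proof of the lemma uses the same bound, with the starting values ranging over the compact interval $[t_\varepsilon, 2t_\varepsilon]$.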

Theorem 6.4 Suppose the correlations $\Sigma_{r,s}(t)$ converge, as $t \to \infty$, for $r, s$ satisfying $2 \le r < s$ and 
$r + s \le 12$. Then the following is true: 

(i) (5.2) holds with some $0 \le \alpha \le 1$, 

(ii) the correlations $\Sigma_{r,s}(t)$ converge for all $r, s$, 

(iii) the standardized Poissonized vector of counts (indexed by $r = 1, 2, \ldots$) converges weakly to one of the multivariate normal laws described in Section 5, 

(iv) the same multivariate normal limit holds for the normalized and centred $X_n$. 

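Proof. For short, write $V_j = V_j(t)$, $f_j = \Phi_j(t)$ and $F_j = \Phi_j(2t)$. 

By Lemma 6.2 the $\Sigma_{r,s}(t)$ converge to nonzero limits, whence, for $r, s$ in the required range, 

$$\sqrt{V_r V_s} \approx \sqrt{V_{r+1}V_{s-1}}$$

(here and below, $\approx$ between two quantities means that their ratio converges to a finite positive limit), 
and hence $V_rV_s \approx V_{r+1}V_{s-1}$. From this, $V_5 \approx V_3V_4/V_2$, $V_6 \approx V_3V_5/V_2 \approx V_3^2V_4/V_2^2$, and substituting in 
$V_3V_6 \approx V_4V_5$ we get $V_4/V_2 \approx (V_3/V_2)^2$. Continuing in this way yields 

$$V_j \approx V_2\,(V_3/V_2)^{j-2} \quad \text{for } 2 \le j \le 10. \qquad (6.4)$$

From this and $F_j \approx \sqrt{V_2V_{j-2}}$, we obtain 

$$F_j \approx V_2\,(V_3/V_2)^{j/2-2} \quad \text{for } 5 \le j \le 12. \qquad (6.5)$$

Substituting (6.4) and (6.5) in $f_j = V_j + c_jF_{2j}$ (recall the formula for the variance) yields 

$$f_j \approx V_2\,(V_3/V_2)^{j-2} \quad \text{for } 3 \le j \le 6. \qquad (6.6)$$

This offers two ways of expressing $F_j$ for $j = 5, 6$: using (6.5) or (6.6), but with the argument $2t$ for 
the latter. The first gives 

$$F_5 \approx V_2(t)\Big(\frac{V_3(t)}{V_2(t)}\Big)^{1/2}, \qquad F_6 \approx V_2(t)\,\frac{V_3(t)}{V_2(t)},$$

and the second gives 

$$F_5 \approx V_2(2t)\Big(\frac{V_3(2t)}{V_2(2t)}\Big)^{3}, \qquad F_6 \approx V_2(2t)\Big(\frac{V_3(2t)}{V_2(2t)}\Big)^{4}.$$

It follows that 

$$\frac{F_6}{F_5} \approx \Big(\frac{V_3(t)}{V_2(t)}\Big)^{1/2} \approx \frac{V_3(2t)}{V_2(2t)}.$$

Applying Lemma 6.3 to $g(t) = V_3(t)/V_2(t)$ shows that this must converge, hence from (6.6) the ratio 
$\Phi_4(t)/\Phi_3(t)$ must converge too. Parts (i), (ii), (iii) of the theorem now follow from Lemma 6.1 and part 
(iv) follows by de-Poissonization. □ 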
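
Since the argument reduces everything to the convergence of a ratio as in (6.1), a numerical sanity check of that ratio may be useful. Lemma 6.1 ties the limit $c$ to the index through $\alpha = r - c(r+1)$, that is, $c = (r-\alpha)/(r+1)$. The sketch below compares this value with the ratio $\Phi_{r+1}(t)/\Phi_r(t)$ computed for frequencies $p_j \propto j^{-1/\alpha}$. Several things here are assumptions made for illustration only: the power-law example, its truncation to finitely many boxes, the choice of $t$, the definition $\Phi_r(t) = \sum_j e^{-tp_j}(tp_j)^r/r!$ read off from the proof of Lemma 6.2, and the hypothetical function names.

```python
import math


def power_law_frequencies(alpha, n_boxes):
    """Truncated frequencies p_j proportional to j^(-1/alpha), summing to 1.

    Truncation to n_boxes is a numerical convenience, not part of the model.
    """
    weights = [j ** (-1.0 / alpha) for j in range(1, n_boxes + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def phi(r, t, p):
    """Poissonized mean Phi_r(t) = sum_j exp(-t p_j) (t p_j)^r / r!."""
    return sum(math.exp(-t * pj) * (t * pj) ** r for pj in p) / math.factorial(r)


def ratio_table(alpha=0.5, t=1.0e4, n_boxes=100_000, r_max=4):
    """Return rows (r, Phi_{r+1}(t)/Phi_r(t), (r - alpha)/(r + 1))."""
    p = power_law_frequencies(alpha, n_boxes)
    # phis[i] holds Phi_{i+1}(t), for i = 0, ..., r_max.
    phis = [phi(r, t, p) for r in range(1, r_max + 2)]
    return [
        (r, phis[r] / phis[r - 1], (r - alpha) / (r + 1))
        for r in range(1, r_max + 1)
    ]


if __name__ == "__main__":
    for r, numeric, limit in ratio_table():
        print(f"r={r}: ratio={numeric:.4f}, (r-alpha)/(r+1)={limit:.4f}")
```

With these default parameters the computed ratios should sit close to the predicted limits; larger `t` requires a larger `n_boxes` so that the truncation remains harmless.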
Combining Theorem 6.4 and Lemma 5.1 we arrive at a very simple test for the convergence, which is 
easy to check in the examples of Section 4: 

Corollary 6.5 The condition $\lim_{j\to\infty} p_{j+1}/p_j = 1$ is necessary for the convergence of the (normalized 
and centred) $X_n$ to a multivariate normal law. 

It should be stressed that the condition is by no means sufficient. For instance, the frequencies p_j = 
c{2 + sin(log … satisfy $p_{j+1}/p_j \to 1$ but do not have the property of regular variation due to the 
oscillating sine factor. Thus in this case $X_n$ has no distributional limit. 
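
The test of Corollary 6.5 is easy to try on simple families. The sketch below evaluates $p_{j+1}/p_j$ for geometric weights $q^j$ and for power-law weights $j^{-1/\alpha}$. Both families, their parameters and the function names are illustrative choices and are not taken from the examples of Section 4; normalising constants are omitted because they cancel in the ratio.

```python
def consecutive_ratio(weight, j):
    """Return p_{j+1} / p_j for unnormalised weights; the constant cancels."""
    return weight(j + 1) / weight(j)


def geometric_weight(j, q=0.5):
    """Unnormalised geometric frequency q^j."""
    return q ** j


def power_law_weight(j, alpha=0.5):
    """Unnormalised power-law frequency j^(-1/alpha)."""
    return float(j) ** (-1.0 / alpha)


if __name__ == "__main__":
    for j in (10, 100, 500):
        print(j, consecutive_ratio(geometric_weight, j),
              consecutive_ratio(power_law_weight, j))
```

For the geometric weights the ratio equals $q < 1$ for every $j$, so the necessary condition fails. For the power-law weights the ratio tends to 1, which by itself decides nothing, since the condition is necessary but not sufficient.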